Semantic Object Parsing with Graph LSTM
نویسندگان
چکیده
By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which is the generalization of LSTM from sequential data or multidimensional data to general graph-structured data. Particularly, instead of evenly and fixedly dividing an image to pixels or patches in existing multi-dimensional LSTM structures (e.g., Row, Grid and Diagonal LSTMs [1][2]), we take each arbitrary-shaped superpixel as a semantically consistent node, and adaptively construct an undirected graph for each image, where the spatial relations of the superpixels are naturally used as edges. Constructed on such an adaptive graph topology, the Graph LSTM is more naturally aligned with the visual patterns in the image (e.g., object boundaries or appearance similarities) and provides a more economical information propagation route. Furthermore, for each optimization step over Graph LSTM, we propose to use a confidence-driven scheme to update the hidden and memory states of nodes progressively till all nodes are updated. In addition, for each node, the forgets gates are adaptively learned to capture different degrees of semantic correlation with neighboring nodes. Comprehensive evaluations on four diverse semantic object parsing datasets well demonstrate the significant superiority of our Graph LSTM over other state-of-the-art solutions.
منابع مشابه
Geometric Scene Parsing with Hierarchical LSTM
This paper addresses the problem of geometric scene parsing, i.e. simultaneously labeling geometric surfaces (e.g. sky, ground and vertical plane) and determining the interaction relations (e.g. layering, supporting, siding and affinity) between main regions. This problem is more challenging than the traditional semantic scene labeling, as recovering geometric structures necessarily requires th...
متن کاملImproved Graph-Based Dependency Parsing via Hierarchical LSTM Networks
In this paper, we propose a neural graph-based dependency parsing model which utilizes hierarchical LSTM networks on character level and word level to learn word representations, allowing our model to avoid the problem of limited-vocabulary and capture both distributional and compositional semantic information. Our model achieves state-ofthe-art accuracy on Chinese Penn Treebank and competitive...
متن کاملSyntax Aware LSTM Model for Chinese Semantic Role Labeling
As for semantic role labeling (SRL) task, when it comes to utilizing parsing information, both traditional methods and recent recurrent neural network (RNN) based methods use the feature engineering way. In this paper, we propose Syntax Aware Long Short Time Memory(SALSTM). The structure of SA-LSTM modifies according to dependency parsing information in order to model parsing information direct...
متن کاملDeep Multitask Learning for Semantic Dependency Parsing
We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. By using efficient, nearly arc-factored inference and a bidirectional-LSTM composed with a multi-layer perceptron, our base system is able to significantly improve the state of the art for semantic dependency parsing, without using hand-engineered features or syntax. We then explore two ...
متن کاملGraph-based Dependency Parsing with Bidirectional LSTM
In this paper, we propose a neural network model for graph-based dependency parsing which utilizes Bidirectional LSTM (BLSTM) to capture richer contextual information instead of using high-order factorization, and enable our model to use much fewer features than previous work. In addition, we propose an effective way to learn sentence segment embedding on sentence-level based on an extra forwar...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016